[DPE-7726] Use Patroni API for is_restart_pending() (instead of SQL select from pg_settings) #1049
base: 16/edge
Codecov Report
❌ Your patch check has failed because the patch coverage (29.16%) is below the target coverage (33.00%). You can increase the patch coverage or adjust the target coverage.
Additional details and impacted files
@@             Coverage Diff             @@
##           16/edge    #1049      +/-   ##
===========================================
- Coverage    64.87%   62.40%   -2.47%
===========================================
  Files           17       17
  Lines         4270     4272       +2
  Branches       656      655       -1
===========================================
- Hits          2770     2666     -104
- Misses        1333     1440     +107
+ Partials       167      166       -1
9c17d76 to e31de74
The previous is_restart_pending() took a long time because of Patroni's loop_wait default value (10 seconds), which sets how long Patroni waits before checking the configuration file again to reload it. Instead of checking PostgreSQL pending_restart from pg_settings, check the Patroni API pending_restart=True flag.
The current Patroni 3.2.2 has weird, flickering behaviour: on many changes made through the REST API it temporarily sets pending_restart=True, which disappears within a second but lasts long enough to be caught by the charm. Sleeping a bit is a necessary evil until the Patroni 3.3.0 upgrade. The previous code slept for 15 seconds waiting for the pg_settings update. Also, unnecessary restarts could be triggered by a mismatch between the Patroni config file and in-memory changes coming from the REST API, e.g. the slots were undefined in the YAML file but set as an empty JSON {} => None. Update the default template to match the default API PATCHes and avoid restarts.
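The "sleep a bit" workaround can be sketched as a re-check after a short pause. This is a minimal illustration, not the charm's actual code: the name is_restart_pending_stable, the read_flag callable and the 1 second default are assumptions made for the example, since the commit message does not state the exact delay.

```python
import time
from typing import Callable


def is_restart_pending_stable(
    read_flag: Callable[[], bool],
    settle_seconds: float = 1.0,  # assumed value, not taken from the PR
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Report a pending restart only if the flag survives a short pause.

    read_flag is a hypothetical callable that returns the current
    pending_restart value as seen through the Patroni REST API.
    """
    if not read_flag():
        return False  # nothing pending, so there is no need to wait
    sleep(settle_seconds)  # give a transient flag time to clear
    return read_flag()  # still set after the pause: treat it as real
```

Passing sleep as a parameter keeps the helper testable without real delays. The pause is only paid when the first read reports a pending restart.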
On a topology observer event, the primary unit used to lose its Primary label.
e31de74 to 1703639
Also:
* use a common logger everywhere
* add several useful log messages (e.g. DB connection)
* remove the no longer necessary debug 'Init class PostgreSQL'
* align the Patroni API request style everywhere
* add Patroni API duration to debug logs
The list of IPs was randomly sorted, causing unnecessary Patroni configuration re-generation followed by a Patroni restart/reload.
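One way to make the rendered configuration stable is to sort the addresses before templating. The sketch below is an assumption about the approach, not the charm's code, and stable_peer_list is a hypothetical helper name. It orders addresses numerically, so that 10.0.0.2 sorts before 10.0.0.10, and groups IPv4 before IPv6.

```python
import ipaddress
from typing import Iterable, List


def stable_peer_list(ips: Iterable[str]) -> List[str]:
    """Return unique addresses in a deterministic numeric order."""

    def sort_key(ip: str):
        address = ipaddress.ip_address(ip)
        # Sort by IP version first so IPv4 and IPv6 are never compared directly.
        return (address.version, int(address))

    return sorted(set(ips), key=sort_key)
```

With a deterministic order, the same set of peers always renders the same file, so a comparison against the file on disk only reports real changes.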
Housekeeping cleanup.
…hanged Those defers are necessary to support scale-up/scale-down during the refresh, but they significantly slow down PostgreSQL 16 bootstrap (and other daily maintenance tasks, like re-scaling, full node reboot/recovery, etc.). Mute them for now, with a proper documentation record forbidding rescaling during the refresh, until we minimise the number of defers in PG16. Throw a warning so that we remember this promise.
The current PG16 logic relies on Juju update-status or on_topology_change observer events, but in some cases we start Patroni without the Observer, which leads to a long wait until the next update-status arrives.
It is hard (impossible?) to catch the Juju Primary label manipulations from the Juju debug-log. Logging them simplifies troubleshooting.
We had to wait 30 seconds when there was no connection, which is unnecessarily long. Also, add details about the reason for a failed connection Retry/CannotConnect.
It speeds up single-unit app deployments.
1703639 to ee8e44b
Issue
The previous is_restart_pending() waited for 15 seconds due to
Patroni's loop_wait default value (10 seconds), which sets how much time
Patroni waits before checking the configuration file again to reload it.
Solution
Instead of checking PostgreSQL pending_restart from pg_settings,
check Patroni API pending_restart=True/undefined.
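The check described in the solution might look roughly like the following. It is a sketch under stated assumptions rather than the code in this PR: it uses the standard library instead of the charm's HTTP client, skips TLS and authentication, and the function names and the base_url value are hypothetical. It relies on the Patroni REST API status returned by GET /patroni, where pending_restart is present when a restart is needed and absent otherwise.

```python
import json
from urllib.request import urlopen


def restart_pending_from_status(status: dict) -> bool:
    """Return True only when the status reports pending_restart as true.

    The key is undefined when no restart is pending, so a missing key
    is treated as False.
    """
    return status.get("pending_restart") is True


def is_restart_pending(base_url: str, timeout: float = 5.0) -> bool:
    """Fetch the member status from the Patroni REST API and check the flag.

    base_url is a hypothetical value such as "http://127.0.0.1:8008".
    """
    with urlopen(f"{base_url}/patroni", timeout=timeout) as response:
        return restart_pending_from_status(json.load(response))
```

Splitting the parsing from the HTTP call keeps the True/undefined handling easy to unit test without a running Patroni.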
Checklist